Japan Patent Office is not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer, so the translation may not reflect the original precisely. 
2. **** shows a word which cannot be translated. 
3. In the drawings, any words are not translated. 



CLAIMS 
[Claim(s)] 

[Claim 1] A character reader which performs character recognition on image data obtained by reading a 
document, characterized by comprising: two or more recognition means, each recognizing the characters used in 
one of two or more kinds of language; an activation means to perform recognition of said image data by said two 
or more recognition means; an acquisition means to acquire a recognition rate of each recognition result 
obtained by said recognition means; a selection means to choose any one of the two or more recognition results 
based on the two or more recognition rates acquired by said acquisition means; and an output means to output 
the recognition result chosen by said selection means. 

[Claim 2] A character reader according to claim 1, characterized in that said selection means is further 
equipped with a comparison means to compare the recognition rates of said two or more recognition results, and 
chooses the recognition result with the highest recognition rate as a result of the comparison by said 
comparison means. 

[Claim 3] A character reader according to claim 1, characterized in that said selection means is further 
equipped with a comparison means to compare the recognition rate of each of said recognition results with a 
predetermined threshold, and chooses a recognition result whose recognition rate is larger than said 
predetermined threshold as a result of the comparison by said comparison means. 

[Claim 4] A character reader according to claim 1, characterized by further having a reading means to read said 
document optically, and a separation means to divide said image data into regions by attribute, based on the 
attributes of the image data read by said reading means. 

[Claim 5] A character reader according to claim 4, characterized in that said recognition means recognizes, as 
the image data, a region whose attribute, among the regions separated by said separation means, is character. 

[Claim 6] A character reader according to claim 4, characterized by further having a division means to divide 
said region into predetermined units. 

[Claim 7] A character reader according to claim 6, characterized in that said predetermined units include at 
least a line and a character. 

[Claim 8] A character reader which performs character recognition on image data obtained by reading a 
document, characterized by comprising: two or more recognition means, each recognizing the characters used in 
one of two or more kinds of language; an activation means to choose one of said two or more recognition means 
and to perform recognition of said image data by the chosen recognition means; an acquisition means to acquire 
a recognition rate of the recognition result obtained by said recognition means; a comparison means to compare 
said recognition rate acquired by said acquisition means with a predetermined threshold; an output means to 
output said recognition result when, as a result of the comparison by said comparison means, said recognition 
rate is larger than said predetermined threshold; and a control means to control so that the selection of said 
recognition means is switched one by one and said activation means is activated until an output by said output 
means is obtained. 

[Claim 9] A character reader according to claim 8, characterized by further having a setting means to set the 
order in which said recognition means are used, wherein said switching is performed according to the order set 
by said setting means. 

[Claim 10] A character reader according to claim 8, characterized by further having a reading means to read 
said document optically, and a separation means to divide said image data into regions by attribute, based on 
the attributes of the image data read by said reading means. 

[Claim 11] A character reader according to claim 10, characterized in that said recognition means recognizes, 
as the image data, a region whose attribute, among the regions separated by said separation means, is character. 

[Claim 12] A character reader according to claim 10, characterized by further having a division means to divide 
said region into predetermined units. 

[Claim 13] A character reader according to claim 12, characterized in that said predetermined units include at 
least a line and a character. 

[Claim 14] A character recognition method of performing character recognition on image data obtained by 
reading a document, characterized by comprising: two or more recognition steps, each recognizing the characters 
used in one of two or more kinds of language; an activation step which performs recognition of said image data 
by said two or more recognition steps; an acquisition step which acquires a recognition rate of each 
recognition result obtained by said recognition steps; a selection step which chooses any one of the two or 
more recognition results based on the two or more recognition rates acquired in said acquisition step; and an 
output step which outputs the recognition result chosen in said selection step. 

[Claim 15] A character recognition method according to claim 14, characterized in that said selection step 
further has a comparison step which compares the recognition rates of said two or more recognition results, and 
chooses the recognition result with the highest recognition rate as a result of the comparison in said 
comparison step. 

[Claim 16] A character recognition method according to claim 14, characterized in that said selection step 
further has a comparison step which compares the recognition rate of each of said recognition results with a 
predetermined threshold, and chooses a recognition result whose recognition rate is larger than said 
predetermined threshold as a result of the comparison in said comparison step. 

[Claim 17] A character recognition method according to claim 14, characterized by further having a reading step 
which reads said document optically, and a separation step which divides said image data into regions by 
attribute, based on the attributes of the image data read in said reading step. 

[Claim 18] A character recognition method according to claim 17, characterized in that said recognition step 
recognizes, as the image data, a region whose attribute, among the regions separated by said separation means, 
is character. 

[Claim 19] A character recognition method according to claim 17, characterized by further having a division 
step which divides said region into predetermined units. 

[Claim 20] A character recognition method according to claim 19, characterized in that said predetermined units 
include at least a line and a character. 

[Claim 21] A character recognition method of performing character recognition on image data obtained by 
reading a document, characterized by comprising: two or more recognition steps, each recognizing the characters 
used in one of two or more kinds of language; an activation step which chooses one of said two or more 
recognition steps and performs recognition of said image data by the chosen recognition step; an acquisition 
step which acquires a recognition rate of the recognition result obtained by said recognition step; a 
comparison step which compares said recognition rate acquired in said acquisition step with a predetermined 
threshold; an output step which outputs said recognition result when, as a result of the comparison in said 
comparison step, said recognition rate is larger than said predetermined threshold; and a control step which 
controls so that the selection of said recognition step is switched one by one and said activation step is 
carried out until an output by said output step is obtained. 

[Claim 22] A character recognition method according to claim 17, characterized by further having a setting step 
which sets the order in which said recognition steps are used, wherein said switching is performed according to 
the order set in said setting step. 

[Claim 23] A character recognition method according to claim 21, characterized by further having a reading step 
which reads said document optically, and a separation step which divides said image data into regions by 
attribute, based on the attributes of the image data read in said reading step. 

[Claim 24] A character recognition method according to claim 23, characterized in that said recognition step 
recognizes, as the image data, a region whose attribute, among the regions separated in said separation step, 
is character. 

[Claim 25] A character recognition method according to claim 23, characterized by further having a division 
step which divides said region into predetermined units. 

[Claim 26] A character recognition method according to claim 25, characterized in that said predetermined units 
include at least a line and a character. 

[Claim 27] A computer-readable memory in which a program for character recognition processing is stored, 
characterized by comprising: procedure code for two or more recognition steps, each recognizing the characters 
used in one of two or more kinds of language; procedure code for an activation step which performs recognition 
of said image data by said two or more recognition steps; procedure code for an acquisition step which acquires 
a recognition rate of each recognition result obtained by said recognition steps; procedure code for a 
selection step which chooses any one of the two or more recognition results based on the two or more 
recognition rates acquired in said acquisition step; and procedure code for an output step which outputs the 
recognition result chosen in said selection step. 

[Claim 28] A computer-readable memory in which a program for character recognition processing is stored, 
characterized by comprising: procedure code for two or more recognition steps, each recognizing the characters 
used in one of two or more kinds of language; procedure code for an activation step which chooses one of said 
two or more recognition steps and performs recognition of said image data by the chosen recognition step; 
procedure code for an acquisition step which acquires a recognition rate of the recognition result obtained by 
said recognition step; procedure code for a comparison step which compares said recognition rate acquired in 
said acquisition step with a predetermined threshold; procedure code for an output step which outputs said 
recognition result when, as a result of the comparison in said comparison step, said recognition rate is larger 
than said predetermined threshold; and procedure code for a control step which controls so that the selection 
of said recognition step is switched one by one and said activation step is carried out until an output by said 
output step is obtained. 




DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Technical Field of the Invention] This invention relates to a character reader which performs character 
recognition on image data obtained by reading a document, and to a method therefor. 

[0002] 

[Description of the Prior Art] In a modern society overflowing with information, there is a pressing demand to 
digitize information so that it becomes easy to manage and search. For such digitization, OCR (optical 
character recognition), which recognizes the characters in an image read by a reader such as a scanner and 
converts them into character codes, is indispensable when the image contains characters, and its precision has 
been improving rapidly. 

[0003] When a single reader equipped with OCR is used to recognize the characters of two or more different 
languages, the differences in the properties of the languages have prevented each language from being 
recognized with sufficient precision. For example, when an OCR designed only for Japanese is used to recognize 
the alphabet, the lowercase letters in particular are not recognized, because their properties differ markedly 
from those of Japanese. 

[0004] Therefore, in order to recognize two or more kinds of language with one reader, a recognition algorithm 
was prepared for each language, and accurate character recognition was achieved by having the user switch, with 
an input unit or the like, to the recognition algorithm corresponding to each language. Alternatively, the 
recognition algorithm was kept the same, the dictionaries for the characters of each language were stored in 
the equipment beforehand, and character recognition of each language was performed by having the user change, 
from an input unit such as an operation panel, to the dictionary of the language of the characters to be 
recognized. In that case, control for changing each dictionary was also needed. 
[0005] 

[Problem(s) to be Solved by the Invention] However, the method in which the user performs character recognition 
of two or more kinds of language while giving instructions to switch the dictionary of each language from an 
input unit such as an operation panel increases the user's time and effort and reduces processing speed. 
Moreover, when there are two or more documents to be read, an ADF (auto document feeder) is used to reduce the 
work of replacing documents. In that case, if English documents and Japanese documents are mixed among the 
documents to be read, the user has to give an instruction every time one document is read, which spoils the 
advantage of the ADF and, as a result, reduces processing speed. 

[0006] This invention has been made in view of the above problems, and aims to provide a character reader, and 
a method therefor, which make possible character recognition of the characters used in each of two or more 
languages and improve the processing speed of character recognition. 
[0007] 

[Means for Solving the Problem] The character reader according to this invention for attaining the above purpose 
has the following configuration. Namely, it is a character reader which performs character recognition on image 
data obtained by reading a document, and has: two or more recognition means, each recognizing the characters 
used in one of two or more kinds of language; an activation means to perform recognition of said image data by 
said two or more recognition means; an acquisition means to acquire the recognition rate of each recognition 
result obtained by said recognition means; a selection means to choose any one of the two or more recognition 
results based on the two or more recognition rates acquired by said acquisition means; and an output means to 
output the recognition result chosen by said selection means. 
[0008] Preferably, said selection means is further equipped with a comparison means to compare the recognition 
rates of said two or more recognition results, and chooses the recognition result with the highest recognition 
rate as a result of the comparison by said comparison means. This is because accurate character recognition can 
be performed by choosing the recognition result with the highest recognition rate. 
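
The selection described in [0008] can be sketched as follows. This is a minimal illustration, not the 
implementation of the invention: the `RecognitionResult` type, the `select_best` function, and the assumption 
that a recognition rate is a number between 0.0 and 1.0 are all hypothetical and do not appear in the original. 

```python
from typing import NamedTuple, Sequence


class RecognitionResult(NamedTuple):
    """One output of a recognition means (hypothetical structure)."""
    language: str  # label of the recognition means that produced the text
    text: str      # recognized characters
    rate: float    # recognition rate; the 0.0-1.0 scale is an assumption


def select_best(results: Sequence[RecognitionResult]) -> RecognitionResult:
    """Compare all recognition rates and return the result with the highest one."""
    if not results:
        raise ValueError("at least one recognition result is required")
    return max(results, key=lambda r: r.rate)


# Illustrative values only; no real recognizer is run here.
candidates = [
    RecognitionResult("ja", "認識", 0.42),
    RecognitionResult("en", "recognition", 0.91),
]
best = select_best(candidates)
print(best.language, best.text)
```

With the illustrative values above, the English result is chosen because its recognition rate is the larger of 
the two. 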

[0009] Preferably, said selection means is further equipped with a comparison means to compare the recognition 
rate of each of said recognition results with a predetermined threshold, and chooses a recognition result whose 
recognition rate is larger than said predetermined threshold as a result of the comparison by said comparison 
means. Preferably, the character reader further has a reading means to read said document optically, and a 
separation means to divide said image data into regions by attribute, based on the attributes of the image data 
read by said reading means. 

[0010] Preferably, said recognition means recognizes, as the image data, a region whose attribute, among the 
regions separated by said separation means, is character. Preferably, the character reader further has a 
division means to divide said region into predetermined units. Preferably, said predetermined units include at 
least a line and a character. 

[0011] Another character reader according to this invention for attaining the above purpose has the following 
configuration. Namely, it is a character reader which performs character recognition on image data obtained by 
reading a document, and has: two or more recognition means, each recognizing the characters used in one of two 
or more kinds of language; an activation means to choose one of said two or more recognition means and to 
perform recognition of said image data by the chosen recognition means; an acquisition means to acquire the 
recognition rate of the recognition result obtained by said recognition means; a comparison means to compare 
said recognition rate acquired by said acquisition means with a predetermined threshold; an output means to 
output said recognition result when, as a result of the comparison by said comparison means, said recognition 
rate is larger than said predetermined threshold; and a control means to control so that the selection of said 
recognition means is switched one by one and said activation means is activated until the output by said output 
means is obtained. 
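
The control flow of [0011] can be sketched as a loop that tries one recognition means at a time. This is a 
minimal sketch under stated assumptions: the `Recognizer` signature, the two stub recognizers, and the choice to 
return `None` when no recognition means exceeds the threshold are all hypothetical; the original does not say 
what happens in that last case. 

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical signature: a recognizer takes image data and returns (text, recognition rate).
Recognizer = Callable[[bytes], Tuple[str, float]]


def recognize_sequentially(
    image: bytes,
    recognizers: Sequence[Recognizer],
    threshold: float,
) -> Optional[str]:
    """Switch recognition means one by one until a rate exceeds the threshold."""
    for recognize in recognizers:      # control means: select the next recognition means
        text, rate = recognize(image)  # activation means + acquisition means
        if rate > threshold:           # comparison means
            return text                # output means
    return None                        # assumption: nothing is output if all fail


def japanese_stub(image: bytes) -> Tuple[str, float]:
    """Stand-in for a Japanese recognizer; returns a fixed low rate."""
    return ("???", 0.30)


def english_stub(image: bytes) -> Tuple[str, float]:
    """Stand-in for an English recognizer; returns a fixed high rate."""
    return ("hello", 0.95)


print(recognize_sequentially(b"", [japanese_stub, english_stub], threshold=0.8))
```

In this sketch the order of the `recognizers` list decides which recognition means is tried first, so placing 
the most likely language first avoids unnecessary recognition passes. 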

[0012] Preferably, the character reader further has a setting means to set the order in which said recognition 
means are used, and said switching is performed according to the order set by said setting means. This is 
because processing speed can be improved by switching in an order suited to the user's usage. Preferably, the 
character reader further has a reading means to read said document optically, and a separation means to divide 
said image data into regions by attribute, based on the attributes of the image data read by said reading means. 

[0013] Preferably, said recognition means recognizes, as the image data, a region whose attribute, among the 
regions separated by said separation means, is character. Preferably, the character reader further has a 
division means to divide said region into predetermined units. Preferably, said predetermined units include at 
least a line and a character. 

[0014] The character recognition method according to this invention for attaining the above purpose has the 
following configuration. Namely, it is a character recognition method of performing character recognition on 
image data obtained by reading a document, and has: two or more recognition steps, each recognizing the 
characters used in one of two or more kinds of language; an activation step which performs recognition of said 
image data by said two or more recognition steps; an acquisition step which acquires the recognition rate of 
each recognition result obtained by said recognition steps; a selection step which chooses any one of the two 
or more recognition results based on the two or more recognition rates acquired in said acquisition step; and 
an output step which outputs the recognition result chosen in said selection step. 

[0015] Another character recognition method according to this invention for attaining the above purpose has the 
following configuration. Namely, it is a character recognition method of performing character recognition on 
image data obtained by reading a document, and has: two or more recognition steps, each recognizing the 
characters used in one of two or more kinds of language; an activation step which chooses one of said two or 
more recognition steps and performs recognition of said image data by the chosen recognition step; an 
acquisition step which acquires the recognition rate of the recognition result obtained by said recognition 
step; a comparison step which compares said recognition rate acquired in said acquisition step with a 
predetermined threshold; an output step which outputs said recognition result when, as a result of the 
comparison in said comparison step, said recognition rate is larger than said predetermined threshold; and a 
control step which controls so that the selection of said recognition step is switched one by one and said 
activation step is carried out until the output by said output step is obtained. 

[0016] The computer-readable memory according to this invention for attaining the above purpose has the 
following configuration. Namely, it is a computer-readable memory in which a program for character recognition 
processing is stored, and has: procedure code for two or more recognition steps, each recognizing the 
characters used in one of two or more kinds of language; procedure code for an activation step which performs 
recognition of said image data by said two or more recognition steps; procedure code for an acquisition step 
which acquires the recognition rate of each recognition result obtained by said recognition steps; procedure 
code for a selection step which chooses any one of the two or more recognition results based on the two or more 
recognition rates acquired in said acquisition step; and procedure code for an output step which outputs the 
recognition result chosen in said selection step. 

[0017] Another computer-readable memory according to this invention for attaining the above purpose has the 
following configuration. Namely, it is a computer-readable memory in which a program for character recognition 
processing is stored, and has: procedure code for two or more recognition steps, each recognizing the 
characters used in one of two or more kinds of language; procedure code for an activation step which chooses 
one of said two or more recognition steps and performs recognition of said image data by the chosen recognition 
step; procedure code for an acquisition step which acquires the recognition rate of the recognition result 
obtained by said recognition step; procedure code for a comparison step which compares said recognition rate 
acquired in said acquisition step with a predetermined threshold; procedure code for an output step which 
outputs said recognition result when, as a result of the comparison in said comparison step, said recognition 
rate is larger than said predetermined threshold; and procedure code for a control step which controls so that 
the selection of said recognition step is switched one by one and said activation step is carried out until the 
output by said output step is obtained. 
[0018] 

[Embodiment of the Invention] Hereafter, preferred embodiments of this invention are explained in detail with 
reference to the drawings. 

<Embodiment 1> Drawing 1 is a block diagram showing the configuration of the character reader of embodiment 1. 
In this drawing, 111 is a CPU, which performs various kinds of control of RAM 113, a keyboard 114, the character 
recognition section 115, a display 116, the recording section 117, and a reading section 118 according to the 
programs stored in ROM 112. 112 is a ROM, which stores various programs for processing data inputted from the 
keyboard 114 and for the character recognition of the character recognition section 115. 113 is a RAM, which 
serves as a working area and a temporary save area for the various programs and for data inputted from the 
keyboard 114. 
[0019] 114 is a keyboard, used to give the instruction to start character recognition described later and to 
enter data. 115 is the character recognition section, which performs the processing explained with the flow 
chart of drawing 3 described later. 116 is a display, which displays the recognition results and processing 
status of the character recognition section 115. 117 is the recording section, which records the recognition 
results of the character recognition section 115 and the like on a recording medium according to the user's 
instructions from the keyboard 114. 

[0020] 118 is a reading section consisting of an optical character reader (OCR) which reads an image optically. 
The read image data is assumed to be a binary image of 1 bit per pixel obtained by simple binarization. The 
read image data is expanded in RAM 113 as a 1-bit bitmapped image, and the processing explained with the flow 
chart of drawing 3 described later is performed on it. 

[0021] In addition, when the image data read by the reading section 118 is skewed because the document was 
tilted, a better result can be obtained by correcting the skew of the image data before performing character 
recognition processing. As a skew correction method, for example, it suffices to obtain the inclination of an 
extracted line and perform a coordinate transformation which removes that inclination. 
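
One way to picture the skew correction of [0021] is the sketch below. It is only an illustration under stated 
assumptions: the original does not say how the inclination of a line is obtained, so the least-squares fit over 
sample points of an extracted line, and the function names `estimate_skew` and `deskew`, are hypothetical. 

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def estimate_skew(points: Sequence[Point]) -> float:
    """Estimate the inclination (radians) of a text line from sample points.

    Assumption: the points lie roughly along one extracted line; the angle is
    taken from a least-squares line fit through them.
    """
    if len(points) < 2:
        raise ValueError("at least two points are needed to estimate an inclination")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.atan2(sxy, sxx)


def deskew(points: Sequence[Point], angle: float) -> List[Point]:
    """Rotate coordinates by -angle about the origin so the inclination disappears."""
    c, s = math.cos(-angle), math.sin(-angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


# A line rising 1 pixel for every 10 pixels of width (illustrative data).
line = [(0.0, 0.0), (10.0, 1.0), (20.0, 2.0)]
angle = estimate_skew(line)
print(round(math.degrees(angle), 2))
print(deskew(line, angle))
```

After the transformation the sample points share the same vertical coordinate (up to floating-point error), 
which is what removing the inclination means here. A real implementation would apply the same rotation to every 
pixel of the bitmap rather than to a handful of points. 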
[0022] 119 is an FDD (floppy disk drive), which is loaded with an FD and can read and write data. The program of 
the flow chart described later can also be written on the loaded FD (not illustrated), and the processing can be 
performed by reading this program into RAM 113 of this equipment. It is also easily possible, instead of the 
FDD 119, to provide a CD-ROM drive or an HDD, store the above program on the CD-ROM or HD loaded in or built into 
the respective drive, and perform the processing by reading the stored program. 

[0023] 120 is a CPU bus, which connects the components of the character reader to one another. In embodiment 1, 
the document read by the reading section 118 is assumed to be a document in Japanese, in English, or in a 
mixture of Japanese and English. Next, the character recognition section 115 is explained in detail using 
drawing 2. Drawing 2 is a block diagram showing the detailed configuration of the character recognition section 
115 of embodiment 1. 

[0024] 101 is the recognition unit extraction section, which divides the image data read by the reading section 
118 into two or more blocks so that recognition is performed for each recognition unit of a predetermined size. 
A recognition unit is, for example, a character block, which is a group of characters occupying a predetermined 
region obtained from the read image by the region separation described later; a line, obtained by dividing a 
character block into lines; or a character, obtained by dividing a line into characters. 

[0025] 102 is recognition section 1, which performs Japanese character recognition and recognizes the image data 
divided by the recognition unit extraction section 101 for each recognition unit. The result of recognition by 
recognition section 1 (102), recognition result 1 ((b) of drawing 2), is outputted to a selector 105. In 
addition, the recognition confidence C1 ((a) of drawing 2), which indicates the certainty of recognition 
result 1, is outputted to the judgment section 104. 

[0026] 103 is recognition section 2, which performs English character recognition and recognizes the image data 
divided by the recognition unit extraction section 101 for each recognition unit. The result of recognition by 
recognition section 2 (103), recognition result 2 ((c) of drawing 2), is outputted to the selector 105. 104 is 
the judgment section, which receives the recognition confidence C1 ((a) of drawing 2) outputted by recognition 
section 1 (102) and outputs a 1-bit judgment signal to the selector 105 based on its value. The judgment section 
104 produces the judgment signal by, for example, comparing the recognition confidence C1 with a predetermined 
threshold T1: it outputs "1" when the recognition confidence C1 is larger than the threshold T1, and "0" when it 
is smaller. Alternatively, the most significant bit of the recognition confidence C1 may be taken and used as 
the judgment signal. 
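
The two ways of deriving the judgment signal in [0026] can be sketched as below. The sketch rests on assumptions 
not stated in the original: the recognition confidence C1 is treated as an unsigned integer of a fixed bit 
width (8 bits by default), the case where C1 equals T1 is treated as "0", and the function names are 
hypothetical. 

```python
def judgment_by_threshold(c1: int, t1: int) -> int:
    """Return 1 when confidence C1 is larger than threshold T1, otherwise 0.

    Assumption: the original does not specify the C1 == T1 case; it is mapped to 0 here.
    """
    return 1 if c1 > t1 else 0


def judgment_by_msb(c1: int, bits: int = 8) -> int:
    """Return the most significant bit of C1 as the 1-bit judgment signal.

    Assumption: C1 is an unsigned integer stored in `bits` bits.
    """
    if not 0 <= c1 < (1 << bits):
        raise ValueError("c1 does not fit in the assumed bit width")
    return (c1 >> (bits - 1)) & 1


print(judgment_by_threshold(200, 128), judgment_by_threshold(100, 128))
print(judgment_by_msb(200), judgment_by_msb(100))
```

For an 8-bit confidence value, taking the most significant bit gives 1 exactly when the value is 128 or more, 
so it behaves like a comparison against a fixed threshold without needing a comparator. 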

[0027] 105 is a selector. It receives the recognition result 1 from the recognition section 1 (102), the recognition
result 2 from the recognition section 2 (103), and the judgment signal from the judgment section 104. Based on the
judgment signal, it selects either the recognition result 1 or the recognition result 2 and outputs it as the
recognition result.
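
The judgment section 104 and the selector 105 described above can be sketched in a few lines of C. This is only an
illustration: the function names are hypothetical, the recognition confidence is assumed to be an 8-bit value (the
text does not state its width), and a confidence equal to the threshold is treated as "not larger".

```c
#include <assert.h>
#include <stdint.h>

/* Threshold form of the judgment signal: 1 when C1 is larger than T1, else 0. */
int judge_by_threshold(uint8_t c1, uint8_t t1)
{
    return c1 > t1 ? 1 : 0;
}

/* High-order-bit form: the most significant bit of an (assumed) 8-bit
 * confidence is used directly as the judgment signal. */
int judge_by_high_bit(uint8_t c1)
{
    return (c1 >> 7) & 1;
}

/* Selector: judgment 1 passes recognition result 1, judgment 0 passes result 2. */
const char *select_result(int judgment, const char *result1, const char *result2)
{
    assert(judgment == 0 || judgment == 1);
    return judgment ? result1 : result2;
}
```

With an 8-bit confidence, the high-order-bit form behaves like the threshold form with T1 fixed at 127, which is
presumably why it is offered as a cheaper alternative.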
[0028] Next, the processing in Embodiment 1 is explained using the flow chart of drawing 3. Drawing 3 is a flow
chart which shows the processing flow of Embodiment 1. In Embodiment 1, the recognition unit into which the
recognition unit logging section 101 divides the image data is a "line".

[0029] At step S1101, the image data read from the read station 118 is divided into blocks by field separation
processing, as follows. Suppose, for example, that the image data read from the read station 118 is as shown in
(a) of drawing 4. Field separation is the processing that classifies this image data by class (for example, title,
text, drawing, picture), as shown in (b) of drawing 4. The flow of field separation processing is explained here
using the flow chart of drawing 5.
[0030] At step S301, the read image data is received field by field in units of m x n pixels and windowing
processing is performed: for any field containing at least one black pixel, resolution conversion lowers the
resolution to the degree at which the image data within that field becomes connected. For example, applying this
resolution conversion to the image data shown in (a) of drawing 4 yields an image like the one in drawing 6.
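
One possible reading of step S301 is sketched below: each m x n window of the binary image becomes a single output
pixel that is black when the window contains at least one black pixel. The function name, the row-major one byte
per pixel layout, and the handling of partial windows at the right and bottom edges are assumptions of this sketch,
not details given in the text.

```c
#include <assert.h>
#include <stddef.h>

/* src: w*h pixels, row-major, 1 = black, 0 = white.
 * dst: ceil(w/m) * ceil(h/n) pixels, row-major, same encoding.
 * An output pixel is black when its m x n window holds any black pixel. */
void reduce_resolution(const unsigned char *src, size_t w, size_t h,
                       size_t m, size_t n, unsigned char *dst)
{
    assert(m > 0 && n > 0);
    size_t dw = (w + m - 1) / m;
    size_t dh = (h + n - 1) / n;
    for (size_t by = 0; by < dh; by++) {
        for (size_t bx = 0; bx < dw; bx++) {
            unsigned char black = 0;
            /* Scan the window; stop as soon as one black pixel is found. */
            for (size_t y = by * n; y < (by + 1) * n && y < h && !black; y++) {
                for (size_t x = bx * m; x < (bx + 1) * m && x < w; x++) {
                    if (src[y * w + x]) {
                        black = 1;
                        break;
                    }
                }
            }
            dst[by * dw + bx] = black;
        }
    }
}
```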
[0031] Fields produced by isolated dots that are clearly noise are eliminated, for example by pattern matching,
after the windowing processing. At step S302, a border-line trace is performed on the resolution-converted image to
grasp the features of the border line of each image. Through this processing, an image whose border line forms a
long and slender pattern is judged to be a text or a title, and any other image can be distinguished as a graphic
form or a picture.

[0032] At step S303, when images of the same class adjoin one another, connection processing that joins the
adjoining images is performed. With this processing, field separation of the image data in (a) of drawing 4 is
completed, giving the result shown in (b) of drawing 4. When field separation completes, two things are obtained: a
data structure that holds, among other things, the class of the image data, as shown in (a) of drawing 7, and block
data that describes the arrangement of the separated blocks, as shown in (b) of drawing 7. From this data structure
and block data, the information needed to determine the sequence in which character recognition processes the
blocks is obtained as appropriate.

[0033] The data structure of (a) of drawing 7 and the block data of (b) are explained. The data structure shown in
(a) of drawing 7 defines, for each block (struct BLOCK) divided by field separation, the class of the block (short
type in the drawing), its position (short startx, starty in the drawing), its width and height (short width, height
in the drawing), and the address (struct BLOCK *next_address) that determines the processing sequence of the blocks.

[0034] In the drawing, the classes of block are defined as 0=TYTLE (title), 1=TEXT (text), 2=FIGURE (drawing), and
3=PICTURE (picture). The number defined for each class is used in the block data as the class of block determined
by field separation. The position of each block is expressed in the x and y coordinates shown in (b) of drawing 4:
startx is the x-coordinate of the block and starty is its y-coordinate. The position of a block is taken to be the
position of its reference corner (the corner indicated by the mark in (b) of drawing 4). The width and height of
each block are its length in the x direction (defined by width) and its length in the y direction (defined by
height).

[0035] Since the image data shown in (a) of drawing 4 is divided into five blocks by field separation (refer to (b)
of drawing 4), BLK[0] to BLK[4] are assigned as the processing sequence (address) of the blocks, as shown in (b) of
drawing 7. Each block records the class of block explained for (a) of drawing 7, its coordinates, its width and
height, and the address of the block to be processed next.

[0036] Taking BLK[0] as an example: if the block indicated by BLK[0] is the title of (b) of drawing 4, type, which
is the class of block, is set to "0". As its position, startx, the x-coordinate, is set to "500" and starty, the
y-coordinate, is set to "300". Furthermore, width is set to "1500" and height is set to "250". next_address, the
address of the block to be processed next, becomes "&BLK[1]". BLK[1] to BLK[4] are described in the same way. For
the block processed last (BLK[4] in the drawing), no next block exists, so next_address, the address of the block
to be processed next, is "NULL".
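
As an illustration of the data structure just described, the following C sketch declares struct BLOCK with the
fields named in the text and fills in the first two entries using the values listed for BLK[0] and BLK[1] in
drawing 7. The enum names and the function init_first_two_blocks are hypothetical, and the chain is cut off after
BLK[1] (whose next_address is set to NULL here, unlike in the drawing) only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Block classes as numbered in drawing 7 (the spelling TYTLE is kept from the drawing). */
enum block_class { BLOCK_TYTLE = 0, BLOCK_TEXT = 1, BLOCK_FIGURE = 2, BLOCK_PICTURE = 3 };

struct BLOCK {
    short type;                  /* class of block */
    short startx, starty;        /* position of the reference corner */
    short width, height;         /* extent in the x and y directions */
    struct BLOCK *next_address;  /* block to process next, NULL for the last one */
};

/* Fills blk[0] and blk[1] with the values given for BLK[0] and BLK[1].
 * Assumption of this sketch: the chain ends at blk[1]. */
void init_first_two_blocks(struct BLOCK blk[2])
{
    assert(blk != NULL);
    blk[0] = (struct BLOCK){ BLOCK_TYTLE, 500, 300, 1500, 250, &blk[1] };
    blk[1] = (struct BLOCK){ BLOCK_TEXT, 1400, 1000, 1450, 250, NULL };
}
```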

[0037] Returning to the flow chart of drawing 3: at step S1102, it is judged whether the processing from step S1103
onward, explained below, has been completed for every block divided by field separation at step S1101. That is,
processing follows the sequence of the blocks, and when the block whose next_address is "NULL" has been processed,
processing of all blocks is complete.
[0038] At step S1103, it is judged whether the block about to be processed contains characters (type is "0" or
"1"). When it contains characters (YES at step S1103), processing proceeds to step S1104. When it does not (NO at
step S1103), processing returns to step S1102. At step S1104, the image data within the (character) block is
projected in the y-axis direction, and the y positions (lines) at which characters exist are extracted. Taking the
text 2 of (b) of drawing 4 as an example, the projection appears on the y-axis at the portions (lines) that contain
characters, as shown in drawing 8. The concrete method of projection in the y-axis direction is explained using the
flow chart of drawing 9.
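
The block traversal of steps S1102 and S1103 amounts to walking the next_address chain until NULL and skipping
blocks that hold no characters. A minimal C sketch, with hypothetical function names and with line extraction and
recognition replaced by a simple counter, might look like this:

```c
#include <assert.h>
#include <stddef.h>

struct BLOCK {
    short type;                  /* 0 = title, 1 = text, 2 = drawing, 3 = picture */
    short startx, starty;
    short width, height;
    struct BLOCK *next_address;  /* NULL marks the last block */
};

/* Step S1103: a block holds characters when it is a title (0) or a text (1). */
int contains_characters(const struct BLOCK *b)
{
    assert(b != NULL);
    return b->type == 0 || b->type == 1;
}

/* Step S1102: follow the chain until next_address is NULL.
 * Counting stands in for the line extraction and recognition of S1104 onward. */
int count_character_blocks(const struct BLOCK *head)
{
    int count = 0;
    for (const struct BLOCK *b = head; b != NULL; b = b->next_address) {
        if (contains_characters(b))
            count++;
    }
    return count;
}
```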

[0039] Drawing 9 is a flow chart which shows the processing flow of the y-axis projection method of Embodiment 1.
Since the characters in each block form a 1-bit bitmapped image, the projection in the y-axis direction is
performed by searching the image within the block bit by bit, judging each pixel to be "black" or "white", and
using the results of that judgment and search. At the start of the projection in the y-axis direction, the values
described in steps S901 to S905 are initialized.

[0040] At step S901, nline, which holds the number of lines in a block containing character image data, is reset.
At step S902, the flag flag, which distinguishes whether the start of a line or the end of a line is being searched
for, is reset. At step S903, the counter j, which counts searches in the y direction, is reset. At step S904, the
counter i, which counts searches in the x direction, is reset. At step S905, the flag kuro, which shows whether the
pixel at the position indicated by the counter i is black, is reset.

[0041] At step S906, it is judged whether the pixel at the coordinate (startx+i, starty+j) is black. When the pixel
is black (YES at step S906), processing proceeds to step S907. When the pixel is not black, that is, when it is
white (NO at step S906), processing proceeds to step S911. At step S911, it is judged whether startx+i is smaller
than the width of the block, width. When it is not smaller (NO at step S911), processing proceeds to step S912.
When it is smaller (YES at step S911), processing proceeds to step S916. At step S916, the value of i is
incremented by 1 and processing returns to step S906.
[0042] On the other hand, at step S907, kuro is set to 1. At step S908, it is judged whether flag is "0" (that is,
whether the current mode is the one that looks for the start of a line). When flag is "0" (YES at step S908),
processing proceeds to step S909. When flag is not "0" (NO at step S908), processing proceeds to step S912.

[0043] At step S909, the value of j is assigned to line_sy[nline], which records the line whose y-coordinate is
starty+j. At step S910, flag is changed to "1". At step S912, it is judged whether kuro is "0" (there is no black
pixel in the line) and flag is "1" (the mode that looks for the end of a line). When kuro is "0" and flag is "1"
(YES at step S912), processing proceeds to step S913. Otherwise (NO at step S912), processing proceeds to step
S917.
[0044] At step S913, the line start position line_sy[nline] is subtracted from j, the line currently being
processed, and the result is assigned to line_h[nline], which holds the height of the nline-th line extracted by
projection. At step S914, flag is returned to the mode that looks for the start of a line; at step S915, nline,
which holds the line count, is incremented by 1; and processing jumps to step S917. On the other hand, when kuro is
not "0" or flag is not "1" (NO at step S912), processing jumps directly to step S917. At step S917, the value of j
is incremented by 1 and processing returns to step S904.
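
The flow of drawing 9 can be condensed into a short C routine. The sketch below keeps the variable names nline,
flag, kuro, i, j and line_sy from the text, and uses line_h as its own name for the line-height array. It is a
simplified illustration rather than a step-for-step transcription: coordinates are taken relative to the block, the
row scan stops at the first black pixel, and two safeguards that the text does not mention are added (a limit on
the number of lines, and closing a line that reaches the bottom edge of the block).

```c
#include <assert.h>

/* img: width*height pixels of one block, row-major, 1 = black, 0 = white.
 * On return, line_sy[k] and line_h[k] hold the start row and height of line k.
 * Returns the number of lines found (nline). */
int extract_lines(const unsigned char *img, int width, int height,
                  int line_sy[], int line_h[], int max_lines)
{
    int nline = 0;
    int flag = 0; /* 0: looking for the start of a line, 1: looking for its end */

    assert(width > 0 && height >= 0 && max_lines > 0);
    for (int j = 0; j < height; j++) {
        int kuro = 0; /* 1 when row j contains a black pixel */
        for (int i = 0; i < width; i++) {
            if (img[j * width + i]) {
                kuro = 1;
                break;
            }
        }
        if (kuro && flag == 0) {
            if (nline == max_lines)
                break;                 /* safeguard added by this sketch */
            line_sy[nline] = j;        /* start of a new line */
            flag = 1;
        } else if (!kuro && flag == 1) {
            line_h[nline] = j - line_sy[nline]; /* first white row ends the line */
            nline++;
            flag = 0;
        }
    }
    if (flag == 1) { /* line touches the bottom edge (added by this sketch) */
        line_h[nline] = height - line_sy[nline];
        nline++;
    }
    return nline;
}
```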

[0045] The processing of drawing 9 described above extracts the line count and the starting point and height of
each line from a character image. The line extraction method explained with the flow chart of drawing 9 projects
the image data within a (character) block in the y-axis direction and extracts the y positions (lines) at which
characters exist, but the method is not restricted to this. For example, after the border-line trace of step S302
of the flow chart of drawing 5 is performed, long and slender objects among those extracted by the border-line
trace are judged to be characters, and at step S303 they are separated as lines, without joining objects of the
same class. If this processing is performed on the image data of (a) of drawing 4, the result is as shown in
drawing 10, and lines are extracted at the time of field separation. However, since the precision of lines
extracted after resolution conversion is low, the value of the predetermined threshold T1 must be set with that
precision in mind.

[0046] Returning to the flow chart of drawing 3: at step S1105, it is judged whether recognition processing of
every line extracted at step S1104 has ended. When recognition processing of all lines has ended (NO at step
S1105), processing proceeds to step S1102. When it has not ended (YES at step S1105), processing proceeds to step
S1106.

[0047] At step S1106, the recognition section 1 performs character recognition line by line and calculates the
recognition result 1 and the recognition confidence C1. At the same time, the recognition section 2 performs
character recognition line by line and obtains the recognition result 2. At step S1108, it is judged whether the
recognition confidence C1 is larger than the predetermined threshold T1. When C1 is larger than the predetermined
threshold T1 (YES at step S1108), processing proceeds to step S1109. When C1 is smaller than the predetermined
threshold T1 (NO at step S1108), processing proceeds to step S1110. At step S1109, the recognition result 1 is
chosen as the recognition result. At step S1110, the recognition result 2 is chosen as the recognition result.
Character recognition of all the lines contained in a block is completed by performing the recognition processing
of step S1106 to step S1110 on every line.

[0048] As explained above, Embodiment 1 provides the character recognition section 1 for Japanese and the character
recognition section 2 for English, and both recognition sections process the read manuscript in parallel. Because
the recognition result is determined from the recognition confidence C1 obtained by the recognition section 1,
character recognition can be performed without the user having to check whether the manuscript is in English or in
Japanese before having the read station 118 read it.

[0049] It is also easy to configure the read station 118 so that it does not depend on the direction of the read
manuscript, by detecting the direction of the image data when image data that is a text or a title is read. As a
method of detecting the direction of the image data, for example, a certain line (image data) is extracted from a
block that field separation has judged to be a text or a title, and the line images obtained by rotating the
extracted line by 0 degrees, 90 degrees, 180 degrees, and 270 degrees are generated. Each of the resulting images
is then read by the read station 118, and the recognition confidence for each rotation of the read image data is
obtained. The direction of the text or title image data is detected from these confidences, and the read
manuscript is rotated according to the detected direction.
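
A simple way to turn the four per-rotation confidences into a detected direction is to take the rotation with the
highest confidence. The text does not spell out the decision rule, so the C sketch below is an assumption: the
function name is hypothetical, and ties are resolved in favour of the smaller angle.

```c
#include <assert.h>
#include <stddef.h>

/* conf[k] is the recognition confidence obtained for the line rotated by k * 90 degrees.
 * Returns the rotation, in degrees, with the highest confidence.
 * Assumption of this sketch: on a tie the smaller rotation wins. */
int detect_rotation(const int conf[4])
{
    int best = 0;
    assert(conf != NULL);
    for (int k = 1; k < 4; k++) {
        if (conf[k] > conf[best])
            best = k;
    }
    return best * 90;
}
```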

[0050] The processing of the flow chart of drawing 3 in Embodiment 1 may instead be performed as explained below
with the flow chart of drawing 11. In drawing 11, the processing of step S1301 to step S1305 is the same as the
processing of step S1101 to step S1105 of drawing 3, so its explanation is omitted.

[0051] At step S1306, the recognition section 1 performs character recognition and calculates the recognition
result 1 and the recognition confidence C1. At step S1307, it is judged whether the recognition confidence C1 is
larger than the predetermined threshold T1. When C1 is larger than the predetermined threshold T1 (YES at step
S1307), processing proceeds to step S1308. When C1 is smaller than the predetermined threshold T1 (NO at step
S1307), processing proceeds to step S1309. At step S1308, the recognition result 1 is chosen as the recognition
result.
[0052] On the other hand, at step S1309, the recognition section 2 performs character recognition and calculates
the recognition result 2 and the recognition confidence C2. At step S1310, the recognition result 2 is chosen as
the recognition result. Character recognition of all the lines contained in a block is completed by performing the
recognition processing of step S1306 to step S1310 on every line. With the processing explained by drawing 11,
character recognition in the recognition section 2 is performed only when, at step S1307, the recognition
confidence C1 is less than the predetermined threshold T1. Therefore, processing by the recognition section 2 is
unnecessary whenever character recognition is completed with the recognition result 1 alone, and the processing
speed can be improved.
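
The speed-up of drawing 11 comes from running the recognition section 2 only on demand. The C sketch below
illustrates that control flow with function pointers. The two stub recognizers, their return strings, and the way
they derive a confidence from the first byte of the line are purely hypothetical stand-ins; only the ordering of
the calls reflects the text.

```c
#include <assert.h>
#include <stddef.h>

/* A recognition section: returns its result and stores its confidence. */
typedef const char *(*recognizer_fn)(const unsigned char *line, int *confidence);

static int section2_calls = 0; /* counts how often the second section runs */

/* Hypothetical stand-in for the recognition section 1 (Japanese). */
const char *stub_section1(const unsigned char *line, int *confidence)
{
    *confidence = line[0];
    return "JA";
}

/* Hypothetical stand-in for the recognition section 2 (English). */
const char *stub_section2(const unsigned char *line, int *confidence)
{
    section2_calls++;
    *confidence = 255 - line[0];
    return "EN";
}

/* Drawing 11: section 2 is invoked only when C1 is not larger than T1. */
const char *recognize_lazy(const unsigned char *line, recognizer_fn section1,
                           recognizer_fn section2, int t1)
{
    int c1 = 0, c2 = 0;
    assert(section1 != NULL && section2 != NULL);
    const char *result1 = section1(line, &c1); /* S1306 */
    if (c1 > t1)                               /* S1307 */
        return result1;                        /* S1308 */
    return section2(line, &c2);                /* S1309, S1310 */
}
```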

[0053] In Embodiment 1, only the recognition section 1 outputs a recognition confidence (C1), but the configuration
is not restricted to this. For example, as shown in drawing 12, the recognition section 2 may also output a
recognition confidence C2 ((d) of the drawing). In this case, the judgment section 104 compares the recognition
confidence C1 with the recognition confidence C2 and outputs a judgment signal to the selector 105 so that the
result with the higher recognition confidence is output as the recognition result. Consequently, the selector 105
chooses the better of the two recognition results. When the range or scaling of the recognition confidence differs
between the recognition section 1 and the recognition section 2, the judgment section 104 may be equipped with a
converter (not illustrated) that makes both ranges or scalings equivalent, and the recognition results of the
recognition sections may be compared after passing through the converter. Also, in Embodiment 1 the recognition
sections use the line as the character recognition unit for a block judged to be a title or a text, and character
recognition is performed on every line contained in the block, but this is not a restriction. For example, the
first line of a block judged to be a title or a text may be extracted, and character recognition by the recognition
section 1 and the recognition section 2 performed on that line. Once the result of either the recognition section
1 or the recognition section 2 has been chosen, character recognition of the remaining image data within the block
may be performed using the selected recognition section alone.
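
The converter mentioned above, which makes the confidence ranges of the two recognition sections comparable, could
be as simple as a linear rescaling onto a common interval. The following C sketch assumes each section reports a
confidence within a known closed range [lo, hi]; the function names, the linear mapping, and the tie rule (section
1 wins) are assumptions of this illustration.

```c
#include <assert.h>

/* Maps a raw confidence from the range [lo, hi] onto [0.0, 1.0]. */
double normalize_confidence(double raw, double lo, double hi)
{
    assert(hi > lo);
    return (raw - lo) / (hi - lo);
}

/* Returns 1 when the recognition result 1 should be selected, 2 otherwise.
 * Assumption of this sketch: equal normalized confidences select result 1. */
int compare_sections(double c1, double lo1, double hi1,
                     double c2, double lo2, double hi2)
{
    double n1 = normalize_confidence(c1, lo1, hi1);
    double n2 = normalize_confidence(c2, lo2, hi2);
    return n1 >= n2 ? 1 : 2;
}
```

For instance, a C1 of 128 on a 0 to 255 scale is weaker than a C2 of 80 on a 0 to 100 scale once both are
rescaled, even though 128 is the larger raw number.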

[0054] Alternatively, for the image data of a block judged to be a title or a text, lines are extracted by
projection in the y-axis direction and characters are extracted by projection in the x-axis direction. Then m
characters are extracted arbitrarily from the block thus divided into lines and characters, and character
recognition processing by the recognition section 1 and the recognition section 2 is performed on the m extracted
characters. Once the result of either the recognition section 1 or the recognition section 2 has been chosen,
character recognition of the remaining image data within the block may be performed using the selected recognition
section alone.
[0055] Such a configuration reduces the processing time required for character recognition processing. In
Embodiment 1, the image data subjected to field separation processing by the recognition unit logging section 101
of drawing 2 was a 1-bit-per-pixel image, but this is not a restriction. For example, field separation can also be
performed on an 8-bit-per-pixel multiple-value image. In that case, field separation processing applies a
differentiation filter to the image data, extracts the high-frequency component of the image data, and separates
the image data into text data and image data based on the result. When English and Japanese need to be
distinguished as in Embodiment 1, the 8-bit-per-pixel multiple-value image data is first converted to binary data
using a fixed threshold, and then field separation processing is performed.
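
Converting an 8-bit multiple-value image to binary data with a fixed threshold is a one-line operation per pixel.
The C sketch below assumes that 0 is black and 255 is white and that a pixel darker than the threshold becomes a
black (1) bit; the text fixes neither the polarity nor the threshold value, so both are assumptions, as is the
function name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* gray:   count pixels, 8 bits each (assumed 0 = black, 255 = white).
 * binary: count pixels, 1 = black, 0 = white.
 * A pixel becomes black when its gray value is below the threshold. */
void binarize(const uint8_t *gray, size_t count, uint8_t threshold, uint8_t *binary)
{
    assert(gray != NULL && binary != NULL);
    for (size_t k = 0; k < count; k++)
        binary[k] = gray[k] < threshold ? 1 : 0;
}
```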

[0056] <Embodiment 2> Embodiment 1 has one recognition section for Japanese and one for English and performs
character recognition of Japanese and English. By extending this configuration and providing a recognition section
for each of two or more languages, character recognition of multiple languages, not restricted to Japanese and
English, can be performed simultaneously.
[0057] For example, as shown in drawing 13, n recognition sections (n is a positive integer), one for each
language, are provided: the recognition section 1 (102), the recognition section 2 (103), ..., the recognition
section n (106). The recognition result 1, the recognition result 2, ..., the recognition result n are output to
the selector 105 from the recognition section 1 (102) to the recognition section n (106). The recognition
confidences C1, C2, ..., Cn are output to the judgment section 104. The judgment section 104 compares the input
recognition confidences C1 to Cn and, based on the result, outputs a judgment signal of log2(n) bits to the
selector 105. The selector 105 then chooses one of the recognition results 1 to n as the recognition result
according to the judgment signal.
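
With n recognition sections, the comparison in the judgment section 104 reduces to finding the index of the largest
confidence, and the judgment signal needs enough bits to encode that index. The C sketch below shows both. The
function names are hypothetical, ties go to the lower-numbered section, and rounding log2(n) up for values of n
that are not powers of two is an assumption, since the text only says "log2(n) bits".

```c
#include <assert.h>
#include <stddef.h>

/* Returns the zero-based index of the section with the highest confidence.
 * Assumption of this sketch: the first (lowest-numbered) section wins a tie. */
size_t best_section(const int conf[], size_t n)
{
    size_t best = 0;
    assert(n > 0);
    for (size_t k = 1; k < n; k++) {
        if (conf[k] > conf[best])
            best = k;
    }
    return best;
}

/* Number of bits needed to encode an index in the range 0 .. n-1,
 * i.e. log2(n) rounded up. */
unsigned signal_bits(size_t n)
{
    unsigned bits = 0;
    assert(n > 0);
    while (((size_t)1 << bits) < n)
        bits++;
    return bits;
}
```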
[0058] For example, character recognition processing with three recognition sections is explained using the flow
chart of drawing 14. Drawing 14 is a flow chart which shows the processing flow of Embodiment 2. At step S1501, the
recognition unit logging section 101 performs field separation, dividing the image data into recognition units. At
step S1502, it is judged whether character recognition processing has been completed for every block divided by
field separation at step S1501. When character recognition processing has been completed for every block (NO at
step S1502), processing ends. When it has not been completed (YES at step S1502), processing proceeds to step
S1503.

[0059] At step S1503, the recognition section 1 performs character recognition and calculates the recognition
result 1 and the recognition confidence C1. At step S1504, it is judged whether the recognition confidence C1 is
larger than the predetermined threshold T1. When C1 is larger than the predetermined threshold T1 (YES at step
S1504), processing proceeds to step S1505. When C1 is smaller than the predetermined threshold T1 (NO at step
S1504), processing proceeds to step S1506. At step S1505, the recognition result 1 is chosen as the recognition
result.
[0060] At step S1506, the recognition section 2 performs character recognition and calculates the recognition
result 2 and the recognition confidence C2. At step S1507, it is judged whether the recognition confidence C2 is
larger than the predetermined threshold T2. When C2 is larger than the predetermined threshold T2 (YES at step
S1507), processing proceeds to step S1508. When C2 is smaller than the predetermined threshold T2 (NO at step
S1507), processing proceeds to step S1509. At step S1508, the recognition result 2 is chosen as the recognition
result.
[0061] At step S1509, the recognition section 3 performs character recognition and obtains the recognition result
3. At step S1510, the recognition result 3 is chosen as the recognition result. Recognition processing of each
block is completed by performing the recognition processing of step S1504 to step S1510 on all blocks. Although
Embodiment 2 uses three recognition sections, this is not a restriction: the number of recognition sections can be
extended without limit according to the number of languages to be recognized. For example, with n recognition
sections, let Cn be the recognition confidence of the recognition section n and Tn its threshold. If the condition
Cn >= Tn is fulfilled, the recognition result n is chosen. If it is not fulfilled, the recognition result n is not
chosen and processing passes to the recognition section (n+1) and the sections after it, and the recognition
result of the recognition section that fulfills the above condition is chosen as the recognition result.
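
The generalised procedure is a cascade: the recognition sections run one after another, and the first one whose
confidence reaches its threshold supplies the result. The C sketch below illustrates this with three stub sections.
Everything named here is hypothetical (the stubs simply read a confidence out of the input bytes), and, following
drawing 14, the last section is assumed to be accepted without a threshold test, so its threshold entry is unused.

```c
#include <assert.h>
#include <stddef.h>

/* A recognition section reduced to its confidence output. */
typedef int (*section_fn)(const unsigned char *block);

static int calls[3]; /* how many times each stub section has run */

/* Hypothetical stand-ins: the confidence is read from the input bytes. */
int stub_section_a(const unsigned char *block) { calls[0]++; return block[0]; }
int stub_section_b(const unsigned char *block) { calls[1]++; return block[1]; }
int stub_section_c(const unsigned char *block) { calls[2]++; return block[2]; }

/* Runs the sections in order and returns the zero-based index of the first
 * one with confidence >= its threshold. If none of the first n-1 qualifies,
 * the last section is run and accepted unconditionally (assumption modelled
 * on the recognition section 3 in drawing 14). */
size_t cascade_select(const unsigned char *block, const section_fn sections[],
                      const int thresholds[], size_t n)
{
    assert(n > 0);
    for (size_t k = 0; k + 1 < n; k++) {
        if (sections[k](block) >= thresholds[k])
            return k;
    }
    (void)sections[n - 1](block); /* last section: result taken as is */
    return n - 1;
}
```

Because later sections run only when earlier ones fall short, placing the most likely language first keeps the
average cost close to that of a single recognition pass.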
[0062] The recognition sections 1 to n normally perform recognition in a predetermined order, but the user may
change that order through an input unit such as a keyboard to suit the intended use. As explained above,
Embodiment 2 makes it possible to perform character recognition of two or more languages simultaneously by
providing a recognition section for each language.

[0063] The purpose of this invention, which is attained by the functions of the apparatus or the method described
above, can also be attained by a storage medium, such as an FD, on which the program of the embodiments described
above is stored. That is, when the storage medium is loaded into the FDD described above, the program itself, read
from the storage medium, realizes the new functions of this invention. The structural features of the program
concerning this invention are as shown in drawing 15 and drawing 16.

[0064] To realize the control of Embodiment 1, the FD holds five modules, as shown in drawing 15: the recognition
module 1011, the activation module 1012, the acquisition module 1013, the selection module 1014, and the output
module 1015. Following the modules stored on this storage medium, processing is performed in the order of step
S1001 to step S1005 shown in drawing 15: "recognition", "activation", "acquisition", "selection", and "output". Of
the stored modules, the "recognition" (step S1001) processing performed by the recognition module 1011 corresponds
to step S1101 to step S1105 of the flow chart of drawing 3.

[0065] "Activation" (step S1002) processing performed by the activation module 1012 and "acquisition" (step S1003)
processing performed by the acquisition module 1013 correspond to step S1106 of the flow chart of drawing 3.
"Selection" (step S1004) processing performed by the selection module 1014 corresponds to step S1108 of the flow
chart of drawing 3. The "output" (step S1005) processing performed by the output module 1015 corresponds to step
S1109 and step S1110 of the flow chart of drawing 3.

[0066] Similarly, to realize the control of Embodiment 2, the FD holds six modules, as shown in drawing 16: the
recognition module 2011, the activation module 2012, the acquisition module 2013, the comparison module 2014, the
output module 2015, and the control module 2016. Following the modules stored on this storage medium, processing
is performed in the order of step S2001 to step S2006 shown in drawing 16: "recognition", "activation",
"acquisition", "comparison", "output", and "control". Of the stored modules, the "recognition" (step S2001)
processing performed by the recognition module 2011 corresponds to step S1501 and step S1502 of the flow chart of
drawing 14.

[0067] "Activation" (step S2002) processing performed by the activation module 2012 and "acquisition" (step S2003)
processing performed by the acquisition module 2013 correspond to step S1503 of the flow chart of drawing 14.
"Comparison" (step S2004) processing performed by the comparison module 2014 corresponds to step S1504 of the flow
chart of drawing 14. The "output" (step S2005) processing performed by the output module 2015 corresponds to step
S1505 of the flow chart of drawing 14. "Control" (step S2006) processing performed by the control module 2016
corresponds to step S1506 to step S1510 of the flow chart of drawing 14.

[0068] This invention may be applied to a system which consists of two or more devices or to equipment which
consists of a single device. Needless to say, this invention can also be applied when it is carried out by
supplying a program to a system or equipment. In this case, the storage medium storing the program concerning this
invention constitutes this invention, and the system or equipment operates in the manner defined beforehand by
reading the program from this storage medium.
[0069] 

[Effect of the Invention] As is clear from the above explanation, this invention provides a character reader, and a
method for it, that enables character recognition of the characters used in each of two or more languages and
improves the processing speed of character recognition.

[Drawing 7]



(a)

struct BLOCK {
    short type ;                  /* 0=TYTLE, 1=TEXT, 2=FIGURE, 3=PICTURE */
    short startx, starty ;
    short width, height ;
    struct BLOCK *next_address ;
} BLOCK ;

(b)

BLK[0].type=0 ;
      .startx=500 ;
      .starty=300 ;
      .width=1500 ;
      .height=250 ;
      .next_address=&BLK[1]
BLK[1].type=1 ;
      .startx=1400 ;
      .starty=1000 ;
      .width=1450 ;
      .height=250 ;
      .next_address=&BLK[2]
BLK[2].type=2 ;
      .startx=150 ;
      .starty=1300 ;
      .width=1500 ;
      .height=1400 ;
      .next_address=&BLK[3]
BLK[3].type=1 ;
      .startx=1700 ;
      .starty=1500 ;
      .width=1000 ;
      .height=800 ;
      .next_address=&BLK[4]
BLK[4].type=1 ;
      .startx=?00 ;               (leading digit illegible in the source)
      .starty=2900 ;
      .width=2800 ;
      .height=1200 ;
      .next_address=NULL